Mining Consumer Health Vocabulary from Community-Generated Text
Abstract
Community-generated text corpora can be a valuable resource for extracting consumer health vocabulary (CHV) and linking it to professional terminologies and alternative variants. In this research, we propose a pattern-based text-mining approach to identify pairs of consumer and professional terms from Wikipedia, a large text corpus created and maintained by the community. A novel measure, based on the ratio of frequencies of occurrence, was used to differentiate consumer terms from professional terms. We empirically evaluated the applicability of this approach using a large data sample consisting of MedLine abstracts and all posts from an online health forum, MedHelp. The results show that the proposed approach can identify synonymous pairs and label each term as either consumer or professional with high accuracy. We conclude that the proposed approach has great potential to produce a high-quality CHV that improves the performance of computational applications in processing consumer-generated health text.
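The abstract does not state the formula behind the frequency-ratio measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the measure compares a term's relative frequency in a consumer corpus (such as forum posts) with its relative frequency in a professional corpus (such as abstracts). The function names, the smoothing constant, and the toy documents are all hypothetical.

```python
import re

def relative_frequency(term, documents):
    """Occurrences of `term` (case-insensitive, whole phrase) per token in `documents`."""
    pattern = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
    hits = sum(len(pattern.findall(doc.lower())) for doc in documents)
    total_tokens = sum(len(doc.split()) for doc in documents)
    return hits / total_tokens if total_tokens else 0.0

def consumer_ratio(term, consumer_docs, professional_docs, smoothing=1e-6):
    """Assumed measure: consumer-corpus frequency divided by professional-corpus frequency.

    The smoothing constant (an assumption) avoids division by zero for terms
    that are absent from one corpus.
    """
    c = relative_frequency(term, consumer_docs)
    p = relative_frequency(term, professional_docs)
    return (c + smoothing) / (p + smoothing)

def label_pair(term_a, term_b, consumer_docs, professional_docs):
    """Label the term with the higher ratio as the consumer term of a synonym pair."""
    ratio_a = consumer_ratio(term_a, consumer_docs, professional_docs)
    ratio_b = consumer_ratio(term_b, consumer_docs, professional_docs)
    if ratio_a >= ratio_b:
        return {"consumer": term_a, "professional": term_b}
    return {"consumer": term_b, "professional": term_a}

# Toy corpora standing in for forum posts and abstracts (invented for illustration).
forum_posts = [
    "i had a heart attack last year",
    "my dad had a heart attack",
]
abstracts = [
    "myocardial infarction was observed in patients",
    "acute myocardial infarction outcomes",
]

labels = label_pair("heart attack", "myocardial infarction", forum_posts, abstracts)
print(labels)
```

Comparing the two ratios within a pair, rather than applying a fixed threshold, is one possible design choice; the paper may use a different decision rule.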
Similar resources
MuEVo, a Breast Cancer Consumer Health Vocabulary Built Out of Web Forums
Semantically analyzing patient-generated text from a biomedical perspective is challenging because of the vocabulary gap between patients and health professionals. Medical expertise and vocabulary are well formalized in standard terminologies and ontologies, which enable semantic analysis of expert-generated text; however, resources which formalize the vocabulary of health consumers (patients a...
Sequence Package Analysis: A New Method for Intelligent Mining of Patient Dialog, Blogs and Help-line Calls
The ambiguities, repetitions and ellipses commonly found in natural language dialog continue to hinder speech (and text) analytic mining programs that glean business intelligence data from consumer help-line calls, or extract important medical diagnostic information from doctor-patient interviews or consumer-generated health-related blogs. This poses an even greater problem when such mining pro...
Incorporating expert terminology and disease risk factors into consumer health vocabularies.
It is well known that the general health-information-seeking layperson, regardless of his/her education, cultural background, and economic status, is not as familiar with, or comfortable using, the technical terms commonly used by healthcare professionals. One of the primary reasons for this is the differences in perspectives and understanding of the vocabulary used by patients and provid...
Combining Text Mining and Data Visualization Techniques to Understand Consumer Experiences of Electronic Cigarettes and Hookah in Online Forums
Since their introduction to the US market in 2007, electronic cigarettes (e-cigarettes) have posed considerable challenges to both public health authorities and government regulators, especially given the debate – in both the scientific world and the community at large – regarding the potential advantages (e.g. helping individuals quit smoking) and disadvantages (e.g. renormalizing...
A Web Application to Support Consumer Health Vocabulary Development
We describe a Web application that supports collaborative development of a consumer health vocabulary. It performs text analyses and enables distributed human review. It also provides on-the-fly summary reports and facilitates the generation of a final vocabulary based on the results of the review.
Journal:
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume: 2014
Publication date: 2014